In recent times there has been a surge of interest in seeking out patterns in the aggregate behavior
of socio-economic systems. One such domain is the emergence of statistical regularities in the 
evolution of collective choice from individual behavior. This is manifested in the sudden emergence 
of popularity or "success" of certain ideas or products, compared to their numerous, often very 
similar, competitors. In this paper, we present an empirical study of a wide range of popularity 
distributions, spanning from scientific paper citations to movie gross income. Our results show that 
in the majority of cases, the distribution follows a log-normal form, suggesting that multiplicative 
stochastic processes are the basis for emergence of popular entities. This suggests the existence 
of some general principles of complex organization leading to the emergence of popularity. We 
discuss the theoretical principles needed to explain this socio-economic phenomenon, and present a 
model for collective behavior that exhibits bimodality, which has been observed in certain empirical 
popularity distributions. 

PACS numbers: 89.75.-k,05.65.+b,89.65.-s 



I. INTRODUCTION 

hit (noun): a person or thing that is successful

popular (adj.), from Latin popularis, from populus: the people, a people
1: of or relating to the general public,
2: suitable to the majority: as (a) adapted to or indicative of the understanding and taste of
the majority, (b) suited to the means of the majority: inexpensive,
3: frequently encountered or widely accepted,
4: commonly liked or approved

Merriam-Webster Online Dictionary.

In a pioneering study of how apparently rational people can behave irrationally as part of a crowd,
Charles MacKay [1] had given several illustrations of certain phenomena becoming wildly popular
without discernible reason. In fact, he had focussed specifically on examples where the individuals
were behaving clearly contrary to their self-interest or that of society as a whole, for example,
the habit of duelling or the practice of witch-hunting. MacKay termed these episodes "moral
epidemics", long before the formal introduction of the concept of social contagion [2] and the use
of biological epidemic models to study such phenomena, ascribing their origin to the nature of men
to imitate the behavior of their neighbors.

However, such herding behavior$^{2}$ is not limited to the examples given in MacKay's book, nor do
the outcomes of such behavior need to be so dramatic in their impact as, say, financial market
crashes or publicly sanctioned genocides. In fact, the sudden emergence of a popular product or
idea that is otherwise indistinguishable in quality from its competitors is a more common example
of the same process at work. These events occur so often that we take such phenomena for granted;
however, the question of why certain products or ideas become much more popular than their
intrinsic quality would warrant remains a fascinating and unanswered problem in the social
sciences. Watts [3] points this out when he says ". . . for every Harry Potter and Blair Witch
Project that explodes out of nowhere to capture the public's attention, there are thousands of
books, movies, authors and actors who live their entire inconspicuous lives beneath the featureless
sea of noise that is modern popular culture."

2 MacKay referred to such behavior as "gregarious", in its original sense of "to flock".

It may be worth mentioning that such popularity may 
be of different kinds, one being runaway popularity im- 
mediately upon release, and, another being modest ini- 
tial popularity followed by ever-increasing popularity in 
subsequent periods. The former is thought to be driven 
by the advertising blitz preceding the release or launch 
of the product while the latter has sometimes been ex- 
plained in terms of self-reinforcing effects, where a slight 
relative edge in terms of initial popularity results in more 
consumers being inclined towards the slightly more pop- 
ular product, thereby increasing its popularity even fur- 
ther and so on, driving up its popularity through positive 
feedback. 

As physicists we are naturally interested to see whether there are general trends that can be
observed in popularity phenomena across the large range of contexts in which they occur. An allied
question is whether this popularity can be related to any of the intrinsic properties of the
products or ideas, or whether it is entirely an outcome of a sequence of chance events. The fact
that popular products are often seen to be not all that qualitatively different from their
competitors, or in some cases, actually somewhat inferior, seems to weigh against the former
possibility. However, we would like to see whether the empirically observed popularity
distributions also suggest the latter alternative. We also need to see whether pre-release
advertising does indeed play a role in creating a high initial burst of popularity.

In this article, we first approach the problem empiri- 
cally, looking at previous work done on measuring pop- 
ularity distributions, as well as presenting some of our 
recent analysis of the popularity phenomena occurring 
in a variety of different contexts. One remarkable uni- 
versality we find is that most popularity distributions we 
examine seem to have long tails, and can be fit either by a 
log-normal or a power-law probability distribution func- 
tion, the exponent of the latter often being quite close to 
$-2$. Another interesting feature observed for some distributions is their bimodal character, with
the majority
of instances occurring at extreme ends of the distribu- 
tion, while the center of the distribution is remarkably 
under-represented. Both of these features indicate a sig- 
nificant departure from the Gaussian distribution that 
may have been naively expected. Next, we survey possi- 
ble theoretical models for explaining the above features 
of the empirical distributions. In particular, we discuss 
how log-normal distributions can arise through several 
agents making independent decisions in choosing from a 
range of products with randomly distributed qualities. 
We also present a model of agent-agent interaction that 
shows a transition from unimodal to bimodal distribu- 
tion of the collective choice, when agents are allowed to 
learn from their previous experience. We conclude with 
a short discussion on how log-normal and power-law tail 
distributions can be generated from the same theoreti- 
cal framework, the former occurring when agents choose 
independently of other agents (basing their decisions on 
individual perceptions of quality) and the latter emerg- 
ing when agent-agent interactions are crucial in deciding 
the desirability of a product. 



II. EMPIRICAL POPULARITY 
DISTRIBUTIONS 

In studying the popularity distribution of products, 
the first question one needs to resolve is how to measure 
popularity. While in some cases this may seem rather 
obvious, e.g., the number of people buying a particu- 
lar book, in other cases it may be difficult to identify 
a unique measure that will satisfy everyone. For exam- 
ple, the popularity of movies can be measured either in 
terms of an average over critics' opinions published in 
major periodicals, web-based voting in movie-related on- 
line communities, the income generated when a movie is 
running in theaters, or the cumulative sales and rentals 
from DVD stores. In most cases, we have let the quality 
of the available data decide our choice of which popular- 
ity measure to use. 



An equally important question one needs to answer is 
the nature of the statistical distribution with which to fit 
the data. In almost all cases reported below, we observe 
distributions that deviate significantly from the Gaussian 
distribution in having extremely long tails. The occur- 
rence of such fat-tailed distributions in so many instances 
is very exciting, as it indicates that the process of emer- 
gence of popular products is more than just N agents 
independently making single binary (i.e., yes or no) deci- 
sions to adopt a particular choice. However, to go beyond 
this conclusion and to identify the possible process in- 
volved, one needs to ascertain accurately the true nature 
of the distribution. This brings up the question of how 
to obtain the probability density function (PDF) from 
the empirical data. The method generally used is to ar- 
range the data into a suitable number of bins to obtain 
a histogram, which in an appropriate limit will provide 
the PDF. This works fine when the underlying distribu- 
tion is Gaussian with sharply decaying tails; however, for 
long-tailed distributions, it is exactly the extreme ends 
one is interested in, which have the least representation 
in the data. As a result, the PDF is extremely noisy 
at the tails, and hence, it is often hard to conclude the 
nature of the distribution. Often, one can remove some 
of the noise by using the PDF to generate the cumula- 
tive distribution function (CDF), which is essentially the 
probability that an event is larger than a given size$^{3}$. As
larger quantities of data points are now accumulated in 
each of the bins, the tail becomes smoother in the CDF 
plot. However, the data binning process is susceptible to noise, which can significantly change
the shape of the distribution depending on the size and boundary values of each bin. This can lead
to serious errors, e.g., wrongly identifying the tail of the distribution as following a power law.
Even if the distribution indeed has a power-law tail, one may obtain a quantitatively erroneous
value for the power-law exponent by using graphical methods based on linear least-squares fitting
on a double logarithmic scale [4].

3 The CDF, $P_c(x)$, of a given process is obtained by integrating the corresponding PDF, $P(x)$,
i.e., $P_c(x) = \int_x^{\infty} P(x') dx'$.

A better way to examine the nature of the tail of a distribution is to avoid binning altogether
and to switch to a rank-ordered plot of the data, which allows one to focus on the upper tail of
the distribution containing the data points of largest magnitude. These plots are often referred
to as Zipf plots, after the Harvard linguist G. K. Zipf, who used such rank-frequency plots of the
occurrence of the most common words in the English language to establish a scaling relation for
written natural languages [5, 6]. In this procedure, the data points are ranked, i.e., arranged in
decreasing order of their magnitude. Note that the CDF can be obtained from the rank-ordered plot
by simply exchanging the abscissa and the ordinate, and suitably scaling the axes. Thus, by
avoiding binning one can make a better judgement of the nature of the distribution. To
quantitatively determine the parameters of the distribution, one of the most robust methods is
maximum likelihood estimation (MLE) [7]. For example, if the underlying distribution $P_c(x)$ has a
power-law tail, then the CDF exponent $\alpha$ can be obtained from the MLE method by using the
formula

$\alpha = n \left[ \sum_{i=1}^{n} \ln \frac{x_i}{x_{min}} \right]^{-1}$, (1)

where $x_{min}$ corresponds to the minimum value of $x$ for which the power-law behaviour holds,
and the sum runs over the $n$ data points with $x_i \geq x_{min}$. Similarly, one can obtain maximum
likelihood estimates of the parameters for log-normal and other distributions.
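
As an illustration of the estimators just described, the following is a minimal Python sketch, not the code used for the analysis reported here. It assumes the standard maximum likelihood form for the CDF exponent of a power-law tail and the usual closed-form log-normal estimates (mean and standard deviation of $\ln x$). The function names, the synthetic Pareto sampler and the sample sizes are our own illustrative choices.

```python
import math
import random


def powerlaw_cdf_exponent_mle(data, x_min):
    """Return the MLE of the CDF exponent alpha, using only points with x >= x_min."""
    tail = [x for x in data if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0.0:
        raise ValueError("need at least one data point strictly above x_min")
    return len(tail) / log_sum


def lognormal_mle(data):
    """Return the MLE (mu, sigma) of a log-normal: mean and std of ln(x)."""
    logs = [math.log(x) for x in data]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    return mu, sigma


def sample_pareto(alpha, x_min, n, rng):
    """Draw n points with CDF P_c(x) = (x / x_min)^(-alpha) by inverse transform."""
    # 1 - rng.random() lies in (0, 1], so the power is always well defined.
    return [x_min * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data only; the parameter values below are illustrative inputs.
    pareto_data = sample_pareto(alpha=1.5, x_min=1.0, n=20000, rng=rng)
    print("estimated alpha:", powerlaw_cdf_exponent_mle(pareto_data, x_min=1.0))
    lognormal_data = [rng.lognormvariate(6.37, 1.75) for _ in range(20000)]
    print("estimated (mu, sigma):", lognormal_mle(lognormal_data))
```

In practice the choice of $x_{min}$ matters: points below it are simply discarded by the power-law estimator, so the estimate describes only the tail.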

It is, of course, obvious that the results from the three different plots, namely, the PDF, the
CDF and the rank-ordered plot, should be related to each other. So, for example, if the CDF of an
empirically obtained distribution is found to exhibit a power-law tail which can be expressed as

$P_c(x) \sim x^{-\alpha}$, (2)

with the characteristic exponent$^{4}$ $\alpha$, it is easy to show that the PDF and the
rank-ordered plots will also exhibit power-law behavior [9]. Moreover, the exponents of the power
law seen in these two cases will be related to the characteristic exponent of the CDF, $\alpha$, as
follows: the PDF will follow the relation

$P(x) \sim x^{-(\alpha + 1)}$, (3)

while the rank-ordered plot will exhibit the relation

$x_k \sim k^{-1/\alpha}$, (4)

where $x_k$ denotes the $k$-th ranked data point. The above examples are all given for the case
when the underlying distribution has a power-law tail; similar relations can be derived for other
underlying distributions, e.g., log-normal.
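
The correspondence between a rank-ordered plot and the CDF can be checked numerically. The sketch below is a hedged illustration in Python: it ranks synthetic data drawn from a pure power-law CDF and compares the value at a given rank with the expectation $x_k \sim k^{-1/\alpha}$. The helper names and the synthetic sample are assumptions made for this example, and ties in the data are ignored.

```python
import random


def rank_ordered(data):
    """Return the data sorted in decreasing order; index k-1 holds the k-th ranked point."""
    return sorted(data, reverse=True)


def ccdf_from_ranks(ranked):
    """Return (x_k, k/N) pairs: the fraction of points that are at least as large as x_k."""
    n = len(ranked)
    return [(x, (k + 1) / n) for k, x in enumerate(ranked)]


def predicted_rank_value(k, n, alpha, x_min):
    """Value expected at rank k if P_c(x) = (x / x_min)^(-alpha), i.e. x_k ~ k^(-1/alpha)."""
    return x_min * (k / n) ** (-1.0 / alpha)


if __name__ == "__main__":
    rng = random.Random(0)
    n, alpha = 100000, 2.0
    # Synthetic power-law sample with x_min = 1 (inverse transform sampling).
    data = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]
    ranked = rank_ordered(data)
    for k in (10, 100, 1000, 10000):
        print(k, ranked[k - 1], predicted_rank_value(k, n, alpha, 1.0))
```

Exchanging the two members of each pair returned by `ccdf_from_ranks` is exactly the swap of abscissa and ordinate that turns the rank-ordered plot into the CDF.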

A. Examples

In the following paragraphs we briefly survey previous empirical work on popularity distributions,
and present some of our own recent analysis of popularity data from a broad variety of contexts.
In most cases, we have characterized the empirical CDF with a log-normal fit over the entire
distribution. However, in those cases where the data is available only for the upper tail of the
distribution, such a procedure is not possible. In these cases, we have presented a rank-ordered
plot of the data and have tried to fit a power law characterized by the CDF exponent $\alpha$. In
this context, we note that most previous observations of popularity distributions had focussed on
the upper tail, and fitted a power law to it. However, we find that the entire distribution is very
often fit much better by the log-normal distribution [10]. We conclude with a brief discussion of
why data that fit a log-normal much better have often been reported in the literature to follow a
power-law tail.

4 This exponent $\alpha$ is often referred to as the Pareto exponent, after the Italian economist
V. Pareto, who was the first to report power-law tails for the CDF of income distribution across
several European countries [8].

1. City Size.

Possibly the first ever empirical observation of a long-tailed popularity distribution is that of
cities, as measured by their population, which was first reported in 1913 by Auerbach [11]. Later,
this basic idea was refined by many others, most notably Zipf. In fact, the last mentioned work has
become so well known that the term Zipf's law is often used to refer to the idea that city sizes
follow a cumulative probability distribution having a power-law tail [12] with exponent
$\alpha = 1$. Over the years, several empirical studies have been published in support of the
validity of Zipf's law [13]. However, other empirical studies have found significant deviations
from the exact form given by Zipf [14]. In a recent review, the combined estimate of the exponent
$\alpha$ from 29 different studies is found to be significantly larger than 1, suggesting a less
extended tail than implied by a strict interpretation of Zipf's law [15]. All these studies have
focused on the upper tail (i.e., larger cities) of the distribution. If one also considers the
smaller cities, the whole distribution often fits a double-Pareto log-normal, i.e., a distribution
which is log-normal in the bulk but has long tails at the two ends [16]. Even the power-law fit of
the tail has itself been called into question by a study of the size distribution of US cities
over the period 1900-1990 [17]. These results are of special significance to our study, as they
show that the fat-tailed distribution of popularity of cities need not be a power law but could be
explained by other distributions.



2. Company Size. 

Of almost similar vintage to the city size literature is the work on company size, measured in
terms of sales or employees. Note that both of these are measures of popularity of the company,
the former measuring its popularity among the consumers of its products, while the latter measures
its popularity in the labor market. In 1932, Gibrat formulated the law of proportional growth,
essentially a multiplicative stochastic process for explaining company growth, which predicts that
the distribution of firm size would follow a log-normal distribution [18, 19]. While this has
indeed been reported from empirical data [20, 21], there have also been reports of a power-law
tail [22]. In particular, Axtell [23] has looked at the size of US companies (listed in the U.S.
Census Bureau database) in terms of the number of employees, which yields a CDF with a power-law
tail whose exponent $\alpha \simeq 1$. When the size was expressed in terms of receipts (in
dollars) this also yielded a power-law CDF with $\alpha \simeq 0.99$.
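
To see why proportional growth leads to a log-normal, note that the logarithm of the size is a sum of many independent random increments, which tends to a Gaussian. The following Python sketch illustrates this under simplifying assumptions of our own (uniformly distributed growth shocks, identical initial sizes, no entry or exit of firms); it is not Gibrat's original formulation, and all parameter values are hypothetical.

```python
import math
import random


def simulate_proportional_growth(n_firms, n_steps, max_shock, rng):
    """Grow each firm from size 1 by independent factors (1 + eps), eps uniform in [-max_shock, max_shock]."""
    sizes = []
    for _ in range(n_firms):
        size = 1.0
        for _ in range(n_steps):
            # The growth factor does not depend on the current size.
            size *= 1.0 + rng.uniform(-max_shock, max_shock)
        sizes.append(size)
    return sizes


def skewness(values):
    """Sample skewness: third central moment divided by variance^(3/2)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    third = sum((v - mean) ** 3 for v in values) / n
    return third / var ** 1.5


if __name__ == "__main__":
    rng = random.Random(0)
    sizes = simulate_proportional_growth(n_firms=5000, n_steps=100, max_shock=0.1, rng=rng)
    # Sizes are strongly right-skewed, while log-sizes are close to symmetric.
    print("skewness of sizes:    ", skewness(sizes))
    print("skewness of log-sizes:", skewness([math.log(s) for s in sizes]))
```

A near-zero skewness of the log-sizes alongside a large positive skewness of the sizes themselves is the signature expected of an approximately log-normal distribution.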

3. Scientists and Scientific Papers. 

The study of popularity in the field of science has a rich and colorful history [24]. One of the
earliest such studies is that on the visibility of scientists, as measured by subjective opinions
elicited from a sample of the scientific community [25]. The skewed nature of the visibility that
arises from misallocation of credit in the field of science, where an already famous scientist
gets more credit than is due compared to less well-known colleagues, has been termed the Matthew
effect [26]. This is quite similar to the unequal degree of popularity seen in show-business
professions, e.g., among movie actors and singers. A more objective measure for the popularity of
scientists is the total number of citations to their papers [27].

The popularity of individual scientific papers can also be analysed in terms of citations to them
[28]. Price [29] had tried to give a theoretical model based on cumulative advantage, along with
supporting evidence showing that the distribution of citations to papers follows a power-law tail.
More recently, in a study [30] analyzing papers in the Institute for Scientific Information (ISI)
database, as well as papers published in Physical Review D, Redner concluded that the probability
distribution of citations follows a power-law tail with an exponent close to $-3$. However, in a
later work looking at all papers published in Physical Review journals over the past 110 years,
this distribution was found to be fit better by a log-normal [31] (Fig. 1 inset).

In addition to the popularity of individual papers measured by the number of their citations, one
can also define the popularity of the journals in which these papers are published by considering
the total number of citations to all articles published in a journal. In Fig. 1 we have plotted
the cumulative distribution of the total citations in 1997-99 to all papers ever published in a
journal. The data has been fit with a log-normal distribution; maximum likelihood estimates of
parameters for the corresponding distribution are $\mu = 6.37$ and $\sigma = 1.75$.

4. Newspapers and Magazines.

The popularity of scientific journals naturally leads us to wonder about the popularity
distribution for general interest magazines as well as newspapers. An obvious measure of
popularity in this case is the circulation figure. Fig. 2 shows the CDF of the top 740 magazines
according to average net circulation per issue in the United Kingdom$^{5}$ in 2005. The figure
shows an approximately log-normal fit; maximum likelihood estimates of parameters for the
corresponding distribution are $\mu = 10.79$ and $\sigma = 1.18$. Next, we analyzed the
circulation figures for the top 200 newspapers in the USA for the year 2005 according to their
circulation$^{6}$. Fig. 2 (inset) shows the corresponding rank-ordered plot with an approximate
power-law fit over a decade yielding Zipf's law, which is supported by the maximum likelihood
estimate of the exponent for the cumulative distribution function, $\alpha \simeq 1.12$.

FIG. 1: The cumulative distribution function for the total number of citations to a journal in a
given year, for all journals ($\sim 5500$) listed in ISI Journal Citation Report (Science edition)
for the years 1997-1999, fit by a log-normal curve (in red). The inset (from Ref. [31]) shows the
cumulative probability distribution of citations, $C(k)$, against the number of citations, $k$, to
all papers published from July 1893 through June 2003 in the Physical Review journals, fit by a
log-normal curve (in red).

FIG. 2: Cumulative distribution function of the 740 most circulated magazines in the UK, fit by a
log-normal curve (in red). The inset shows the rank-ordered plot of the top 200 newspapers in the
USA according to circulation.

FIG. 3: Cumulative probability distribution of the number of votes given by registered users of
IMDb to movies and TV series released or shown between the years 2000-2004, fit by a log-normal
curve (in red). (Inset) The probability distribution of the IMDb rating of a movie, averaged over
all the votes received.



5. Movies. 

Movie popularity can be measured in a variety of ways, e.g., by looking at the votes given by
users of various movie-related online forums. One of the largest of such forums is the Internet
Movie Database (IMDb)$^{7}$ that allows registered users to rate films (and television shows) in
the range 1-10 (with 1 corresponding to "awful" and 10 to "excellent"). We looked at the
cumulative distribution of all votes received by movies or TV series shown between 2000 and 2004
(Fig. 3). The tail of the distribution approximately fits a log-normal distribution, with maximum
likelihood estimates of the corresponding parameters, $\mu = 8.60$ and $\sigma = 1.09$. Next, we
look at the distribution of the average rating given to these items. As the minimum and maximum
ratings that an item can receive are 1 and 10, respectively, this distribution is necessarily
bounded. The skewed probability distribution of the average rating resulting from our analysis is
shown in Fig. 3 (inset).

The measures used above have many drawbacks as indicators of movie popularity, particularly so
when they are aggregated to produce average values. For example, users may judge different movies
according to very different information, with so-called classic movies faring very differently
from recently released movies that have very little information available about them. Also, it
does not cost anything to vote for a movie, so that the vital element of competition among movies
to become popular is missing in this measure. In contrast, looking at the gross income
distribution of movies that are being shown at theaters gives a sense of the relative popularity
of movies that have roughly equal amounts of information available about them. Also, this kind of
"voting with one's wallet" is a truer indicator of the viewer's movie preferences. The freely
available datasets about weekly earnings of most movies released across theaters in the USA make
this a practical exercise. For our study we have concentrated on data from The Movie Times$^{8}$
and The Numbers$^{9}$ websites for the period 2000-2004. Although total gross may be a better
measure of movie popularity, the opening gross is often thought to signal the success of a
particular movie. This is supported by the observation that about 65-70% of all movies earn their
maximum box-office revenue in the first week of release [32]. The rank-ordered distributions for
the opening, as well as the total gross, show an approximate power law with an exponent
$-1/\alpha \simeq -1/2$ in the region where the top grossing movies are located [33]. However,
when the data are aggregated together we find that the distribution (Fig. 4) is better fit by a
log-normal$^{10}$ (similar to the observation of Redner vis-a-vis citations) [34]. The maximum
likelihood estimates of the log-normal distribution parameters yield $\mu = 3.49$ and
$\sigma = 1.00$. Further, we observe that the total gross distribution is just a scaled version of
the opening distribution, which essentially implies that the popularity distribution of movies is
decided at the opening itself. An additional feature of interest is that both the opening and the
total gross distributions are bimodal (Fig. 4 inset), implying that most movies either do very
well or very badly at the box office.

10 We have also verified this for the income distribution of Indian movies.

FIG. 4: Cumulative distribution of total gross income for movies released across theaters in the
USA during 2000-2004, fit by a log-normal curve (in red). The inset shows the distribution of
movie income according to the opening weekend gross.

FIG. 5: (Left) The total gross ($G_t$, in dollars) of a movie vs its production budget (in
dollars). (Right) The total gross ($G_t$, in dollars) of a movie vs the number of theaters in
which it is released on the opening weekend.



We have tried to see whether the popularity of individual movies correlates with their production
quality (as measured by production budget). Fig. 5 (left) shows a plot of the total gross vs
production budget for a large number of movies released between 2000 and 2004 whose budget
exceeded $10^6$ dollars. As is clear from the figure, although movies with higher production
budgets tend, in general, to earn more, the correlation is not strong (the correlation coefficient
is only 0.62). One can also argue that the determination of the success of a movie at its opening
implies a key role for pre-release advertising. Although the data for advertising budgets is often
unavailable, we can use as a surrogate the number of theaters in which a movie is initially
released, since the advertising cost should scale with this quantity. As is obvious from Fig. 5
(right), the correlation here is worse, indicating that advertising often has very little role to
play in deciding whether or not a movie becomes popular. In this context, one may note that De
Vany & Walls have looked at the distribution of movie earnings and profit as a function of a
variety of variables, such as genre, ratings, presence of stars, etc., and have not found any of
these to be significant determinants [35].



To make a quantitative analysis of the relative performance of movies, we have defined the
persistence time $\tau$ of a movie as the time (measured in number of weekends) up to which it is
being shown at theaters. We observe that most movies run for up to about 10 weekends, after which
there is a steep drop in their survival probability. The empirical data seem to fit a Weibull
distribution quite well.
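
For readers who wish to try such a fit, the sketch below shows one possible way to estimate Weibull parameters from a list of persistence times in Python. It assumes the two-parameter form with survival probability $\exp[-(\tau/\lambda)^k]$ and solves the maximum likelihood condition for the shape $k$ by bisection. This is an illustrative choice on our part, not necessarily the procedure behind the fit mentioned above, and the shape and scale values in the demonstration are hypothetical.

```python
import math
import random


def weibull_survival(t, shape, scale):
    """Probability that the persistence time exceeds t, for a two-parameter Weibull."""
    return math.exp(-((t / scale) ** shape))


def weibull_mle(data, lo=0.05, hi=50.0, iterations=60):
    """Return MLE (shape, scale) for strictly positive data; shape is found by bisection."""
    logs = [math.log(x) for x in data]
    mean_log = sum(logs) / len(logs)
    x_max = max(data)

    def score(k):
        # Likelihood equation for the shape; dividing by x_max avoids overflow in x**k.
        weights = [(x / x_max) ** k for x in data]
        weighted_log = sum(w * v for w, v in zip(weights, logs)) / sum(weights)
        return weighted_log - 1.0 / k - mean_log

    # score(k) increases with k, so the root is bracketed by [lo, hi] and bisection applies.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    shape = 0.5 * (lo + hi)
    mean_power = sum((x / x_max) ** shape for x in data) / len(data)
    scale = x_max * mean_power ** (1.0 / shape)
    return shape, scale


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic persistence times; random.weibullvariate takes (scale, shape).
    times = [rng.weibullvariate(10.0, 2.0) for _ in range(5000)]
    shape, scale = weibull_mle(times)
    print("estimated shape and scale:", shape, scale)
    print("survival probability at 10 weekends:", weibull_survival(10.0, shape, scale))
```

A shape parameter larger than 1 corresponds to a survival probability that stays high at first and then drops steeply, which is the qualitative behaviour described above.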



6. Websites and Blogs.

Zipf's law for the distribution of requests for pages from the web was first reported by Glassman
[36]. By tracing web accesses from DEC's Palo Alto facilities, $10^5$ HTTP requests were gathered
and the rank-ordered distribution of pages was shown to have an exponent $\simeq -1$. This was
supported by a popular article [37] which observed Zipf's law when analysing the incoming
page-requests to a single site (www.sun.com). However, subsequent investigation of the page
request distribution seen by web proxy caches, using traces from a variety of sources, found the
rank-order exponent to vary between 0.64 and 0.83 [38]. The deviation from the earlier result
(showing exact Zipf's law) was ascribed to the fact that web accesses at a web server and those at
a web proxy are different, because the former includes requests from all users on the Internet
while the latter includes only those users from a fixed group. Access statistics for web pages
have also been analysed by Adamic and Huberman using about 60000 individual usage logs from
America Online [39]. The resulting cumulative distribution of website popularity, according to the
number of unique visits to a website by users, showed a power-law fit with $\alpha$ very close
to 1.

Another obvious measure of webpage popularity is the number of links to it from other webpages.
The distribution of incoming links to a webpage (i.e., URLs pointing to a certain HTML document)
for the nd.edu domain has been shown to obey a power law with exponent $\simeq -2.1$ [40]. This
power law was quantitatively confirmed (i.e., the same exponent value of 2.1 was reported) over a
much larger data set involving a web-crawl of the entire WWW with $2 \times 10^8$ webpages and
$1.5 \times 10^9$ links [41]. While the power-law distribution of the popularity of websites
according to the number of incoming links has been well established, among webpages of the same
type (e.g., the set of US newspaper homepages) the bulk of the distribution of incoming links
deviates strongly from a power law, exhibiting a roughly log-normal shape [42].

The finding that the micro-structure of popularity within a group is closer to a log-normal
distribution has created some controversy among researchers involved in measuring the popularity
distribution of blogs$^{11}$, which have over the past few years picked up a large following all
over the web. Shirky [44] had arranged 433 weblogs in rank order according to the number of
incoming links from other blogs and had claimed an approximate power-law distribution. In contrast
to this, Drezner & Farrell [43] conducted a study of the incoming link distribution of over 4000
blogs dealing almost exclusively with political topics, and found the distribution to be much
better fit by a log-normal than a power law. Other studies have made contradictory claims about
whether the popularity of blogs is better fit by a log-normal or a power-law tailed distribution
[45, 46].

We have also analysed the popularity distribution of blogs according to citations in other blogs,
using three different blogosphere ecologies, i.e., directories of blog listings. Such ecologies
scan all blogs registered with them for (i) the number of links they receive from other blogs in
their list, as well as (ii) the number of visits to that blog. These two measures of popularity
complement each other, as the former looks at who is getting the most links from other bloggers,
while the latter shows which blogs are actually receiving the most readers. The most extensive
data that we have analyzed comes from the TTLB Blogosphere ecosystem$^{12}$ that lists 52048
blogs. In Fig. 6 we show the CDF for the popularity of blogs from this ecology, measured by the
number of links to that blog seen on the "front page" of other member blogs within the past 7-10
days. This can be considered a rolling snapshot of the relative popularity of different blogs at a
particular instant of time. For comparison, we also looked at data from two other ecologies,
namely, the Technorati$^{13}$ and the Blogstreet$^{14}$ ecosystems, and observed qualitatively
almost identical behavior. The CDF (Fig. 6) shows an approximately log-normal fit; maximum
likelihood estimates of parameters for the corresponding distribution are $\mu = 1.98$ and
$\sigma = 1.51$. We have also analyzed the popularity of blogs listed in the TTLB ecosystem
according to traffic, i.e., views per day (Fig. 6 inset), which shows a power law over almost two
decades for the rank-ordered plot. The maximum likelihood estimate of the corresponding exponent
for the cumulative distribution function yields $\alpha \simeq 0.67$.

FIG. 6: Cumulative distribution function for blog popularity measured by the number of incoming
links a blog receives from other weblogs listed in the TTLB Blogosphere ecosystem within the past
7-10 days. The curve is the best log-normal fit to the data. The inset shows the rank-ordered plot
of blog popularity according to the number of visits to a blog in a single day, in the TTLB
ecosystem.


7. File Downloads. 

Another web-related measure of popularity is that of 
file downloads. There are numerous file repositories in 
the net which allow visitors to download files either freely 
or for a fee. We focussed on files stored in the MAT- 
LAB Central File Exchange 15 , which are computer pro- 
grams. We looked at the number of downloads of all 
files over a period of one month during early 2006. The 
CDF [Fig. [7] (left)] shows an approximately log-normal
fit; maximum likelihood estimates of parameters for the
corresponding distribution are μ = 3.76 and σ = 0.89.
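
As a minimal sketch of how such parameter estimates can be obtained, the maximum likelihood values of μ and σ for a log-normal are simply the mean and the (1/n) standard deviation of the logarithms of the data. The function names and the synthetic sample below are assumptions for illustration; the actual download counts are not reproduced here.

```python
import math
import random


def lognormal_mle(values):
    """Return (mu, sigma) maximising the log-normal likelihood of positive data."""
    logs = [math.log(v) for v in values]          # requires every value > 0
    n = len(logs)
    mu = sum(logs) / n
    # The MLE uses the 1/n (not 1/(n-1)) variance of the logarithms.
    sigma = math.sqrt(sum((y - mu) ** 2 for y in logs) / n)
    return mu, sigma


def lognormal_cdf(x, mu, sigma):
    """CDF of the fitted log-normal, for comparison with an empirical CDF."""
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for download counts, drawn with the quoted parameters.
    sample = [rng.lognormvariate(3.76, 0.89) for _ in range(5000)]
    print(lognormal_mle(sample))
```

Plotting `lognormal_cdf` with the fitted parameters against the empirical CDF gives the kind of comparison shown in the figures.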

8. Groups. 

A fertile area for observing the distribution of popularity
is in the arena of social groups. While membership data
for clubs, gangs, co-operatives, secret societies, etc.,
are difficult to come by, with the rising popularity of the
internet it is easy to obtain data for online communities 
such as those in Yahoo 16 or Orkut 17 . By observing the 
memberships of each of the groups in the community that 
a user can join, one can have a quantitative measure of 
the popularity of these groups. An analysis of the Yahoo 
groups resulted in a fat-tailed cumulative distribution of 



A blog or weblog has been defined as a web page with minimal 
to no external editing, providing on-line commentary, periodi- 
cally updated and presented in reverse chronological order, with 
hyperlinks to other online sources [43]. Blogs can function as
personal diaries, technical advice columns, sports chat, celebrity 
gossip, political commentary, or all of the above. 



http://truthlaidbear.com/
http://www.technorati.com/
http://www.blogstreet.com/

http://www.mathworks.com/matlabcentral/fileexchange/
http://groups.yahoo.com
http://www.orkut.com
FIG. 7: (Left) Cumulative distribution function for the number of downloads of different files in 1 month during early 2006
from the MATLAB file exchange site. (Right) Cumulative distribution function for the number of members in different Yahoo
groups under the Business & Finance (squares) and Computers & Internet (diamonds) categories. Groups with fewer than 5
members are not considered. For both figures, the curves (in red) are the best log-normal fits to the data.



the group size [47]. Even though the distribution has a
significant curvature over the entire range, the tail fits a
power law for slightly more than a decade, with exponent
α = 1.8.

We have recently carried out a smaller-scale study 
of the popularity of Yahoo groups 18 . As in the ear- 
lier study, the popularity of the groups in each category 
has been estimated by the number of group members. 
Fig. [7] (right) looks at the cumulative distributions of the 
group size for two categories, namely Business & Finance 
and Computers & Internet, which comprise 182086 and
172731 groups respectively. However, unlike the power law
reported in the earlier study, we found both
distributions to approximately fit a log-normal form, with
the parameters for the corresponding distributions being
μ = 2.80, σ = 2.00 and μ = 3.10, σ = 2.05, respectively.

One can also look at the popularity of individual mem- 
bers of an online group, which has been analysed for a 
different type of community on the web: that formed by
the users of the Pretty Good Privacy (PGP) encryption
algorithm. To ensure that identities are not forged, users 
certify one another by "signing" the other person's public 
encryption key. In this manner, a directed network (the 
"web of trust" ) is created where the vertices are users and 
links are the user certifications. A measure of popularity 
in this case will be the number of certifications received
by a user from other users, i.e., the number of incoming
links for a vertex in the "web of trust". The in-degree
cumulative distribution has been reported to be a power
law with the exponent α ≈ 1.8 [48].



9. Elections. 

Political elections are processes that can be viewed as 
contests of popularity between individual candidates, as 
well as parties. The fraction of votes received by candi- 
dates is a direct measure of their popularity, regardless 
of whether the electoral system uses a majority voting 
rule (where the candidate with the largest number of 
votes wins) or a proportional representation (parties get- 
ting representation at the legislative house proportional 
to their fraction of the popular vote). Such studies have 
been carried out for, e.g., the 1998 Brazilian general elec- 
tions , which looked at the fraction of votes received 
by candidates for the positions of state deputies. The 
resulting frequency distribution was fit by a power law 
with exponent very close to −1. The cumulative distribution,
however, revealed that about 90% of the candidates'
votes followed a log-normal distribution, with a
large dispersion that resulted in the apparent power law.

We have carried out an analysis of the distribution of 
votes for a number of general elections in Canada and 
India. The data about votes for individual candidates in 
Canada was obtained from the website Elections Canada 
On-line 19 for the general elections held in 1997, 2000, 
2004 and 2006. The total number of candidates in each 
election varied between 1600 and 1800, there were over 300
electoral constituencies, and the total number of votes
cast was around 13 million. Each constituency was divided
into hundreds of polling stations, thereby allowing
us to obtain a micro-level picture of the popularity of the
candidates at a particular constituency across the different
polling stations. Fig. [8] (left) shows the results of



The entire Yahoo groups community is divided into 16 categories,
each of which is then further divided into subcategories.

19 http://www.elections.ca/
FIG. 8: Canadian elections: (Left) The rank-ordered plot of candidate popularity measured by the fraction of votes received
by him or her, for four successive general elections. The inset shows the cumulative frequency distribution function for this 
popularity measure. Note the region of linear decay in the middle of the curve. (Right) Cumulative probability distribution 
function for the fraction of votes received by a candidate for all constituencies in the 2000 general election. The inset shows 
the cumulative distribution function of the vote fraction for candidates for all polling booths at each constituency in the above 
election. Note that a constituency can have hundreds of polling booths. 



our analysis, indicating an exponential decay of the tail 
of the popularity distribution for all the elections being 
considered. The results do not change even if we consider
the number of votes, rather than the vote fraction. Fig.
[8] (right) shows that the distribution of popularity across
polling stations is almost identical to
that seen over the larger scale of electoral constituencies.
Note that we did not examine the popularity of parties for
Canada, as the total number of parties was only about
10.

Next, we looked at the corresponding data for the 2004 
general elections in India obtained from the website of 
the Election Commission of India 20 . The total number 
of candidates was 5435, about half of whom belonged to
230 registered parties, who contested from a total of 543
electoral constituencies, while the total number of votes
cast was about 400 million. Fig. [9] (left) shows that the
rank-ordered popularity (measured by the vote fraction) 
distribution for candidates in an Indian general election 
is qualitatively similar to that of Canada, except for the 
presence of a kink indicative of the bimodal nature of the 
distribution. This implies that candidates either receive 
most of the votes cast by electors in that constituency 
or very few votes. This may be due to the very large number
of independent candidates (i.e., without affiliation to
any recognized party) in Indian elections compared to
Canada. This is supported by our analysis of the popularity
of recognized political parties [Fig. [9] (right)], which shows
an exponential decay at the tail. Note that the popularity
of a party is measured by the total votes received by
the party divided by the number of constituencies in which


it contested. This is the same (up to a scaling constant) as
the percentage of votes received by candidates belonging 
to a party, averaged over all the constituencies in which 
the party had fielded candidates. 



10. Books. 

An obvious popularity distribution based on product 
sales is that of books, especially in view of the record- 
breaking sales in recent times of the Harry Potter series 
of books. However, the lack of freely available data about 
exact sales figures has so far prevented detailed analysis 
of book popularity. It was reported in a recent paper [50]
that the cumulative distribution of book sales from the
online bookseller Amazon 21 has a power-law tail with
α ≈ 2. However, one should note that Amazon does not
reveal exact sales figures, but rather only the rank according
to sales; therefore, this distribution was actually
based on a heuristic relation between rank and sales proposed
by Rosenthal [51]. Needless to say, this is at best a
very rough guide to the exact sales figures (e.g., although
the sale of Harry Potter and the Half-Blood Prince fluctuated
a lot during the few weeks following its publication,
it remained steady as the top ranked book in Amazon)
and is likely to yield a misleading distribution of sales. A
more reliable dataset, if somewhat old, has been compiled
by Hackett [12] for the total number of copies sold
in the USA of the top 633 bestselling books between 1895 and
1965. Newman [7] has reported the maximum likelihood
estimate for the exponent of the power law fit to this data



20 http://www.eci.gov.in/ 21 http://www.amazon.com 
FIG. 9: Indian election: (Left) The rank-ordered plot of candidate popularity measured by the fraction of votes received, for
the 2004 Lok Sabha election. The inset shows the frequency distribution of the vote fraction, clearly indicating a bimodal 
nature with candidates receiving either most of the votes cast or very few. (Right) Cumulative probability distribution function 
of party popularity for the 2004 election, measured by the fraction of votes received by candidates from that party, over all the 
constituencies it contested in. 



as α ≈ 2.51. Fig. [10] (left) shows the rank-ordered plot
of this data, indicating an approximate power law fit for
slightly more than a decade, with an exponent of −0.4.
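
The two exponents quoted for the book data are mutually consistent, as the following short relation suggests. It is a sketch that assumes a pure power-law tail and takes 2.51 to be the exponent of the cumulative distribution; the symbols N (number of items), r (rank) and x_r (sales of the item of rank r) are introduced here only for illustration. Since the item of rank r has roughly r items at or above its value, one has

```latex
\frac{r}{N} \simeq P_c(x_r) \sim x_r^{-\alpha}
\quad\Longrightarrow\quad
x_r \sim r^{-1/\alpha},
\qquad
\frac{1}{\alpha} = \frac{1}{2.51} \approx 0.40 .
```

The slope of the rank-ordered plot is therefore expected to be about −0.4 for this value of the cumulative exponent.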



11. Language. 

Fig. [10] (right) shows the cumulative distribution of the
first-language speaker population for different languages
around the world. The data has been obtained from Ethnologue 22 ,
which provides the number of first-language
speakers (over all countries in the world) wherever possible.
Out of a total of 7299 languages listed in its 15th
edition, we have considered over 6650 languages for
which information about the number of speakers is available.
The figure shows a long tail with an approximately
log-normal fit; maximum likelihood estimates of parameters
for the corresponding distribution are μ = 8.78 and
σ = 3.17. Note that this kind of popularity distribution
is different from the others we have discussed so far, as the
speakers are not really free to choose their first language;
rather, this is connected to the population growth rate of
a particular linguistic community. A similar kind of popularity
distribution is that for family names, which has
been analysed by Miyazima et al. [53] for Japanese family
names and Newman [7] for American family names, both
reporting cumulative distribution functions with power
law tails having α close to 1. However, for Korean family
names [54] the distribution was reported to be exponentially
decaying.



12. Other Popularity distributions. 



Unlike the distribution of family names discussed
above, the frequency of occurrence of given names (or
first names) is indeed subject to waves of popularity,
with certain names appearing to be very common at a
particular period. A recent study [55] has looked at the
distribution of the most popular given names in England and
Wales over the past millennium, and has claimed a long-tailed
distribution for the same. Another popularity distribution
is that of tourist destinations, as measured by
the number of tourist arrivals over a time period. A
study [56] that ranked 89 countries, focussing on the
period 1980-1990, has found evidence for a log-normal
distribution as the best fit to the data.

The occurrence of superstars (i.e., extremely successful
performers) in popular music has led to a relatively large
amount of literature by economists on the occurrence of
popularity [57, 58, 59, 60]. Chung & Cox have used the
number of gold records by performers as the measure of
their artistic success, and found the tail of this popularity
distribution to approximately follow a power law [61].
Another study [62] looked at the longevity of music bands
in the list of Top 75 best-selling recordings, and observed
a stretched exponential distribution 23 . However, a more
recent study [63] has shown the survival probability of a
music recording on the Billboard Hot 100 chart to be fit
better by the log-logistic distribution.



http://www.ethnologue.com/



While the term stretched exponential distribution is quite common
in the physics literature, we observe that in other scientific
fields it is more commonly referred to as the Weibull distribution.
FIG. 10: (Left) The rank-ordered plot of bestselling books (that sold 2 million copies or more) according to the number of
copies sold in the USA between 1895 and 1965. Adapted from Ref. [7], data provided by M. E. J. Newman. (Right) Cumulative
distribution function for the size of the population of first-language speakers for over 6650 languages. The data was obtained 
from Ethnologue. The curve (in red) indicates the best log-normal fit to the data. 
the potential audience. Unlike the overall gross that decays
exponentially with time, the gross per theater shows
a power-law decay in time with exponent β ≈ −1 [6o| .
This has a striking similarity with the time-evolution of 
popularity for scientific papers in terms of citations. It 
has been reported that the citation probability to a paper 
published t years ago, decays approximately as 1/t [641 ] 
[Fig. QT] (inset)]. Note that, Price [H[ had also noted 
a similar behavior for the decay of citations to papers 
listed in the Science Citation Index. In a very different 
context, namely, the decay in the popularity of a web- 
site (as measured by the rate of download of papers from 
the site) over time t has also been reported to follow an 
inverse power-law, but with a different exponent [6fi|. 



FIG. 11: Weekend gross per theater for a movie (scaled by
the average weekend gross over its theatrical lifespan), after it
has run for W weekends, averaged over the number of movies
that ran for that long. The initial decline follows a power
law with exponent β ≈ −1 (the fit is shown by the broken
line). The inset (from Ref. [64]) shows the probability that
a paper will be cited t years after publication in a Physical
Review journal, in the years 1952 and 1972, as well as over
the period 1932-1982. Over the range of 2-20 years the
integrated data is consistent with a power law decay having
an exponent −0.94 (broken line in red).



B. Time-evolution of popularity 

Here we look briefly at how popularity evolves over 
time. For movies, we look at the gross income per theater
over time (Fig. [11]). This is a better measure of the
dynamics of movie popularity than the time-evolution of 
the weekly overall gross income, because a movie that 
is being shown in a large number of theaters has a big- 
ger income simply on account of higher accessibility for 



C. Discussion 

The selection of (mostly) long-tailed empirical popularity
distributions presented above underlines the following
broad features of such distributions: (i) the entire
distribution seems to be fit by a log-normal curve (in the
few cases where the entire distribution is not available,
the upper tail seems to fit a power law with characteristic
exponent α which is often close to 1, corresponding
to the exact form of Zipf's law); (ii) in some cases the
distribution shows a bimodal character, with most of the
instances occurring at the two ends of the distribution;
(iii) the decay of popularity in some cases seems to show
a simple power law decay, declining inversely with time
elapsed since release; (iv) the persistence time at high
levels of popularity shows a Weibull distribution in many
instances.

The first of these features may come somewhat as a sur- 
prise, because for many popularity distributions, power 
law tails have been reported with various exponents, of- 
ten significantly different from 1. However, we observe 
that very often log-normal distributions have been mis- 
FIG. 12: A schematic diagram of the emergence of popularity
as a relation between agents and objects (products or ideas). 

takenly identified as having power law tails. In fact this 
is a very common error, especially if the variance of the 
log-normal distribution is sufficiently large. To see this, 
note that the log-normal distribution, 

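P(x) = \frac{1}{x \sigma \sqrt{2\pi}} \, e^{-(\ln x - \mu)^2 / 2\sigma^2},    (5)

can be written as (on taking logarithm on both sides),

\ln P(x) = -\frac{(\ln x)^2}{2\sigma^2} + \left( \frac{\mu}{\sigma^2} - 1 \right) \ln x - \ln\left( \sqrt{2\pi}\,\sigma \right) - \frac{\mu^2}{2\sigma^2},    (6)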

which is a quadratic curve in a doubly logarithmic plot. 
However, a sufficiently small part of the curve will appear 
as a straight line, with the slope depending on which segment
of the curve one is focussing attention on p?l. [67]. This
is the origin of most of the power law tails with exponent
α ≠ 1 that have been reported in the literature on
popularity distributions.
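
The argument can be made concrete with a small sketch. Differentiating Eq. (6) with respect to ln x gives a local slope of −1 − (ln x − μ)/σ² on a doubly logarithmic plot, so that over one decade in x the slope changes by only ln(10)/σ², which is small when σ is large. The function names below are illustrative assumptions, and the parameter values in the usage example are merely those quoted earlier for the language data.

```python
import math


def log_pdf(x, mu, sigma):
    """ln P(x) for the log-normal density, written term by term as in Eq. (6)."""
    lx = math.log(x)
    return (-lx * lx / (2.0 * sigma ** 2)
            + (mu / sigma ** 2 - 1.0) * lx
            - math.log(math.sqrt(2.0 * math.pi) * sigma)
            - mu ** 2 / (2.0 * sigma ** 2))


def local_slope(x, mu, sigma):
    """Slope d ln P / d ln x of the log-normal on a doubly logarithmic plot."""
    return -1.0 - (math.log(x) - mu) / sigma ** 2


def slope_change_over_decade(x, mu, sigma):
    """Change in local slope between x and 10 x (illustrative helper)."""
    return local_slope(10.0 * x, mu, sigma) - local_slope(x, mu, sigma)


if __name__ == "__main__":
    mu, sigma = 8.78, 3.17
    for x in (1e4, 1e5, 1e6):
        print(x, local_slope(x, mu, sigma), slope_change_over_decade(x, mu, sigma))
```

A fit restricted to one or two decades of such a curve would thus return an apparent power-law exponent that depends on where the fitting window is placed.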

III. MODELS OF POPULARITY 
DISTRIBUTION 

From the perspective of physics, popularity can be 
viewed as an emergent outcome of the collective decision 
process in a society of individual agents exercising their 
free will (as reflected in their individual preferences) to 
choose between alternative products or ideas (Fig. [12]).
In a system without authoritarian control, agents differ 
in their personal preferences which are determined by the 
information available to the agent about the possible al- 
ternatives. However, in any real-life scenario with uneven 
access to information, a seemingly well-informed agent
may influence the choice of several other agents [68].
Thus, the emergence of a popular product is a result of 
the self-organized coordination of choices made by het- 
erogeneous entities. 

The simplest model of collective choice is one where 
the agents decide independently of each other and select 
alternatives at random with a one-step decision process. 
It is easy to see that the possible alternatives will not be 
significantly different in terms of popularity from each 
other. In particular, the popularity distribution aris- 
ing from such a process will not have long tails. There 
are two possible alternative modifications of this simple 
model that will allow it to generate distributions similar 
to the ones seen empirically. The first option is to al- 
low interactions between agents where the choice of one 



agent can influence that of another. While this is often 
true in real-life, we also observe long-tailed distributions 
much before the interaction among agents (and the re- 
sulting dissemination of information) has had a chance 
to influence the popularity. For example, the long-tailed 
distribution of movie popularity, in terms of gross earn- 
ing, is seen at the opening weekend itself, long before po- 
tential movie viewers have had a chance to be influenced 
by other moviegoers. The second option for generating
realistic popularity distributions gets around this problem:
here we replace the single-step decision process by
one comprising multiple sub-decisions (as there may
be many factors involved in making a particular decision),
each of which contributes to the overall decision to
purchase a particular product. Therefore, the probability
of any particular entity achieving a particular degree
of popularity can be expressed as the product of the
probabilities of each of the underlying factors satisfying the
required condition to make an agent opt for that entity.
Since the logarithm of such a product is a sum of many
independent random terms, it is approximately normally
distributed by the central limit theorem; the resultant
distribution arising from such a multiplicative stochastic
process therefore has a log-normal form, agreeing with
many of the empirically observed distributions 24 .
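
A minimal numerical sketch of this multiplicative mechanism follows. It assumes, purely for illustration, that each sub-decision contributes an independent factor drawn uniformly from (0, 1]; the text does not specify the distribution of the factors, and the function names are hypothetical. For this choice each ln(factor) has mean −1 and variance 1, so the logarithm of a product of n factors should have mean close to −n and standard deviation close to √n, with an approximately normal shape.

```python
import math
import random


def multiplicative_popularity(n_factors, rng):
    """Product of n_factors independent random factors (assumed uniform on (0, 1])."""
    product = 1.0
    for _ in range(n_factors):
        product *= 1.0 - rng.random()   # lies in (0, 1], so the logarithm is finite
    return product


def log_moments(samples):
    """Mean and standard deviation of ln(sample)."""
    logs = [math.log(s) for s in samples]
    mean = sum(logs) / len(logs)
    var = sum((y - mean) ** 2 for y in logs) / len(logs)
    return mean, math.sqrt(var)


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [multiplicative_popularity(30, rng) for _ in range(5000)]
    print(log_moments(samples))
```

The same qualitative outcome is expected for other factor distributions with finite variance of the logarithm, which is the sense in which the log-normal form is generic here.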

While the bulk of the popularity distributions, showing 
a log-normal nature, can therefore be plausibly explained 
as the product of the multiplicative stochastic structure 
underlying even apparently simple decision processes, 
this would still leave unanswered the reason for the wide 
occurrence of Zipf's law in other instances. We now turn
to the first option for extending the simple model out- 
lined above, i.e., investigating the influence of an agent's 
choice behavior on other agents. It turns out there have 
been many proposed mechanisms to explain the ubiquity 
of power-law tailed distributions employing interactions. 
However, from the point of view of the present paper, 
the most relevant (and general) model seems to be the 
Yule process [69], as modified by Simon [70]. This is essentially
a cumulative advantage process by which the
relatively more popular entities get even more popular
by virtue of being better known.

The Yule-Simon process can be described as follows: 
Suppose initially there are n agents, each of whom is
free to choose one of a number of products. Subsequently,
the number of agents is augmented by unity at
each time step. At any point in time, when the total
number of agents is m, the number of distinct products,
each of which has been chosen by k agents, is denoted



One can argue that the probability distribution of collective 
choice may also reflect the distribution of quality amongst vari- 
ous competing entities; however, in this case the popularity dis- 
tribution would be essentially identical to the quality distribu- 
tion, which a priori can follow any arbitrary distribution. The 
universality of long-tailed popularity distributions and the seem- 
ing absence of any correlation between popularity and quality 
(when it can be measured in any well-defined manner) would 
argue against this hypothesis. 
by f(k,m). Then, given that (i) there is a constant
probability, γ, that an agent chooses a completely new
product (i.e., one that has not been chosen before by any
of the agents) and (ii) the probability of choosing a product
that has already been chosen by k agents is proportional
to k f(k,m), one obtains an asymptotic popularity
distribution that has a power-law tail 25 with exponent
α = 1/(1 − γ). If the appearance of a new product is relatively
infrequent, i.e., γ is extremely small, then the exponent
α ≈ 1 (i.e., Zipf's law).
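
The process just described can be sketched in a few lines. The sketch below assumes that the initial agents each start with a distinct product and writes the new-product probability as `gamma`; both the initial condition and the function name are assumptions made for illustration. Letting a new agent copy the product of a uniformly chosen earlier agent selects each existing product with probability proportional to the number of agents who have already chosen it, which is the cumulative advantage rule.

```python
import random
from collections import Counter


def yule_simon(n_initial, n_steps, gamma, rng):
    """Simulate the cumulative-advantage process sketched in the text.

    Returns a Counter mapping product id -> number of agents who chose it.
    Assumption: the n_initial agents each start with a distinct product.
    """
    choices = list(range(n_initial))       # choices[a] = product chosen by agent a
    next_product = n_initial
    for _ in range(n_steps):
        if rng.random() < gamma:
            choices.append(next_product)   # a completely new product
            next_product += 1
        else:
            # Copying a uniformly chosen earlier agent picks a product with
            # probability proportional to its current popularity k.
            choices.append(rng.choice(choices))
    return Counter(choices)


if __name__ == "__main__":
    counts = yule_simon(n_initial=10, n_steps=20000, gamma=0.1, rng=random.Random(0))
    print(counts.most_common(5))
```

Rank-ordering the resulting counts on a doubly logarithmic plot is one way to inspect the tail; a quantitative estimate of the exponent would need a much longer run and a careful fit.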

Another feature of popularity distributions that has 
been mentioned earlier is that, in some cases, they ap- 
pear to have a bimodal nature. We now present a simple 
agent-based model [73[ that shows how bimodal and uni- 
modal distributions of popularity can arise very simply 
through agents interacting with each other, and reacting 
to information about what the majority are choosing in 
the previous time step. 



A. A Model for Bimodal Distribution of Collective 
Choice 

We have already discussed the simplest model of col- 
lective choice in which individual agents make completely 
independent decisions. For binary choice (i.e., each agent 
can only choose between two options) the emergence of 
collective choice is equivalent to a one-dimensional ran- 
dom walk with the number of steps equal to the number 
of agents. Therefore, the outcome will be normally dis- 
tributed, with the most probable outcome being an equal 
number of agents choosing each alternative. While such 
unimodal distributions of popularity are indeed observed 
in some situations, as mentioned earlier in this article 
many real-life examples show the occurrence of bimodal 
distributions indicative of highly polarized choice behav- 
ior among agents resulting in the emergence of a highly 
popular product. This polarization suggests that agents 
not only opt for certain choices based on their personal 
preferences, but are also influenced by other agents in 
their social neighborhood. Also, the personal preferences 
may themselves change over time as a result of the out- 
come of previous choices, e.g., whether or not their choice 
agreed with that of the majority. This latter effect is an 
example of a global feedback process that we think is crucial
in the occurrence of bimodal behavior.

We now present a general model of collective decision 
that shows how polarization in the presence of individual 
choice volatility can be achieved with an adaptation and 
learning dynamics of the personal preference. In this 
model, the choices of individual agents are not only affected
by those of their neighbors, but, in addition, their
preference is modified by their previous choice as well as



Note that the models of Price [29], Barabasi-Albert [71] and
Redner [72] are all special cases of this general mechanism.



information about how successful their previous choice 
behavior was in coordinating with that of the majority. 
Here it is assumed that information about the intrin- 
sic quality of the alternative products is inaccessible to 
the agent, who takes the cue from what the majority is 
choosing to decide which one is the "better choice" . Ex- 
amples of such limited global information about the ma- 
jority's preference available to an agent are the results of 
consumer surveys and publicity campaigns disseminated 
through the mass media. 

The simplest, binary choice version of our model is de- 
fined as follows. Consider a population of N agents, each 
of whom can be in one of two choice states S = ±1 (e.g., 
to buy or not to buy a certain product, to vote Party A 
or Party B, etc.). In addition, each agent has an individual
preference, θ, that is chosen from a uniform random
distribution initially. At each time step, every agent con- 
siders the average choice of its neighbors at the previous 
instant, and if this exceeds its personal preference, makes 
the same choice; otherwise, it makes the opposite choice. 
Then, for the i-th agent, the choice dynamics is described 
by: 

S_i^{t+1} = \mathrm{sign}\Big( \sum_{j \in \mathcal{N}} J_{ij} S_j^{t} - \theta_i^{t} \Big),    (7)

where sign(x) = +1 if x > 0, and = −1 otherwise.
The coupling coefficient among agents, J_ij, is assumed
to be a constant (= 1) for simplicity and normalized by
z (= |𝒩|), the number of neighbors. In a lattice, 𝒩 is the
set of spatial nearest neighbors and z is the coordination
number, while in the mean field approximation, 𝒩 is the
set of all other agents in the system and z = N − 1.
The individual preference, θ, evolves over time as:

\theta_i^{t+1} =
\begin{cases}
\theta_i^{t} + \mu S_i^{t+1} + \lambda S_i^{t}, & \text{if } S_i^{t} \neq \mathrm{sign}(M^{t}), \\
\theta_i^{t} + \mu S_i^{t+1}, & \text{otherwise},
\end{cases}    (8)

where M^t = (1/N) Σ_j S_j^t is the collective decision of
the entire community at time t. Adjustment to previous
choice is governed by the adaptation rate μ in the
second term on the right-hand side of Eq. (8), while the
third term, governed by the learning rate λ, represents
the correction when the individual choice does not agree
with that of the majority at the previous instant. The desirability
of a particular choice is assumed to be related to
the fraction of the community choosing it; hence, at any
given time, every agent is trying to coordinate its choice
with that of the majority. Note that, for μ = 0, λ = 0,
the model reduces to the well-known zero-temperature,
random field Ising model (RFIM).

Random neighbor and mean field model. For math- 
ematical convenience, we choose the z neighbors of an 
agent at random from the N — 1 other agents in the sys- 
tem. We also assume this randomness to be "annealed" , 
i.e., the next time the same agent interacts with z other 
agents, they are chosen at random anew. Thus, by ig- 
noring spatial correlations, a mean field approximation 
is achieved. 
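
The random-neighbor version of the model can be sketched as follows. This is an illustrative reading of Eqs. (7) and (8) rather than the code used for the results reported here: the synchronous update order, the function names, and the initial range [−1, 1] assumed for the uniformly distributed preferences are assumptions, the last because the text only states that the preferences are initially uniform.

```python
import random


def sign(x):
    """sign(x) = +1 if x > 0 and -1 otherwise, as defined after Eq. (7)."""
    return 1 if x > 0 else -1


def step(S, theta, z, mu, lam, rng):
    """One synchronous update of choices (Eq. 7) and preferences (Eq. 8)."""
    N = len(S)
    majority = sign(sum(S) / N)            # sign of the collective decision M^t
    new_S = []
    for i in range(N):
        # Annealed random neighbours: z distinct agents other than i,
        # drawn afresh at every time step.
        picks = rng.sample(range(N - 1), z)
        field = sum(S[j if j < i else j + 1] for j in picks) / z
        new_S.append(sign(field - theta[i]))
    new_theta = []
    for i in range(N):
        t = theta[i] + mu * new_S[i]       # adaptation to the new choice
        if S[i] != majority:
            t += lam * S[i]                # learning: previous choice was in the minority
        new_theta.append(t)
    return new_S, new_theta


def simulate(N, z, mu, lam, n_steps, seed=0):
    """Return the time series of the collective decision M."""
    rng = random.Random(seed)
    S = [rng.choice((-1, 1)) for _ in range(N)]
    theta = [rng.uniform(-1.0, 1.0) for _ in range(N)]   # assumed initial range
    history = []
    for _ in range(n_steps):
        S, theta = step(S, theta, z, mu, lam, rng)
        history.append(sum(S) / N)
    return history


if __name__ == "__main__":
    print(simulate(N=200, z=4, mu=0.1, lam=0.05, n_steps=20)[:5])
```

A histogram of the values returned by `simulate` over a long run, for different values of `mu` and `lam`, is the quantity to compare with the unimodal and bimodal distributions of M discussed in the text.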
FIG. 13: (Left) The spatial pattern of choice (S) in the absence of learning (λ = 0) in a two-dimensional square lattice of
1000 × 1000 agents after 500 iterations starting from a random configuration. The figure is a magnified view of the central
100 × 100 region showing the absence of long-range correlation among the agents. (Right) The spatial pattern of choice (S)
with learning (λ = 0.05) in the same system, with a majority of agents now in the choice state S = +1. The magnified view of
the central 100 × 100 region shows coarsening of regions having agents aligned in the same choice state.



For z = N − 1, i.e., when every agent has
information about the entire system, it is easy to see that,
in the absence of learning (λ = 0), the collective decision
M follows the evolution equation M^{t+1} =
sign[(1 − μ) M^t − μ Σ_{τ=1}^{t−1} M^τ]. For 0 < μ < 1, the system
alternates between the ordered states M = ±1 with a period
∼ 4/μ. The residence time at any one state (∼ 2/μ)
diverges with decreasing μ, and for μ = 0, the system remains
fixed at one of the ordered states corresponding to
M = ±1, as expected from RFIM results. At μ = 1, the
system remains in the disordered state, so that M = 0.
Therefore, we see a transition from a bimodal distribution
of the collective decision, M, with peaks at non-zero
values, to a unimodal distribution of M centered about
0, at μ_c = 1. When we introduce learning, so that λ > 0,
the agents try to coordinate with each other, and in the
limit λ → ∞ it is easy to see that S_i = sign(M) for all i,
so that all the agents make identical choices. In the simulations,
we note that the bimodal distribution is recovered
for μ = 1 when λ > 1.
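
The quoted period can be checked by iterating the mean-field map directly. The sketch below assumes the map has the form M^{t+1} = sign[(1 − μ) M^t − μ Σ M^τ], with the sum running over τ = 1, ..., t − 1 and the initial preferences neglected; the summation limits, the initial condition and the function names are assumptions made for illustration.

```python
def sign(x):
    """sign(x) = +1 if x > 0 and -1 otherwise."""
    return 1 if x > 0 else -1


def mean_field_trajectory(mu, n_steps, m0=1):
    """Iterate M^{t+1} = sign[(1 - mu) M^t - mu * sum_{tau < t} M^tau]."""
    M = m0
    past_sum = 0           # sum of M^tau for tau = 1 .. t-1 (empty at t = 1)
    trajectory = [M]
    for _ in range(n_steps):
        M_next = sign((1.0 - mu) * M - mu * past_sum)
        past_sum += M
        M = M_next
        trajectory.append(M)
    return trajectory


def mean_period(trajectory):
    """Average number of steps between successive -1 -> +1 switches."""
    ups = [t for t in range(1, len(trajectory))
           if trajectory[t - 1] == -1 and trajectory[t] == 1]
    if len(ups) < 2:
        return None
    return (ups[-1] - ups[0]) / (len(ups) - 1)


if __name__ == "__main__":
    for mu in (0.2, 0.1, 0.05):
        print(mu, mean_period(mean_field_trajectory(mu, 4000)), 4.0 / mu)
```

The measured period should be compared with 4/μ only approximately, since the residence times are integers and the rounding at each switch depends on μ.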

For finite values of z, the population is no longer "well-mixed"
and the mean-field approximation becomes less
accurate the lower z is. For z ≪ N, the critical value of
μ at which the transition from a bimodal to a unimodal
distribution occurs in the absence of learning is μ_c < 1. For
example, μ_c = for z = 2, while it is 3/4 for z = 4. As
z increases, μ_c quickly converges to the mean-field value,
μ_c = 1. On introducing learning (λ > 0) for μ > μ_c, we
again notice a transition to an ordered state, with more
and more agents coordinating their choice.

Lattice. To implement the model when the neighbors 
are spatially related, we consider d-dimensional lattices 
(d = 1, 2, 3) and study the dynamics numerically. We re- 
port results obtained in systems with absorbing bound- 
ary conditions; using periodic boundary conditions leads 
to minor changes but the overall qualitative results re- 
main the same. It is worth noting that the adaptation 



term disrupts the ordering expected from results of the 
RFIM for d = 3, so that for any non-zero μ the system
is in a disordered state when λ = 0.

In the absence of learning (λ = 0), starting from an
initial random distribution of choices and personal preferences,
we observe only very small clusters of similar
choice behavior [Fig. 13 (left)] and the average choice M
fluctuates around 0. In other words, at any given time
an equal number (on average) of agents have opposite
choice preferences. Introduction of learning in the model
(λ > 0) gives rise to significant clustering as well as a
non-zero value for the collective choice M. We find that
the probability distribution of M [Fig. 14 (left)] evolves
from a single peak at 0 to a bimodal distribution as λ
increases from 0. This is similar to a second-order phase
transition in systems undergoing qualitative changes at a
critical threshold. The collective decision M switches
periodically from a positive value to a negative value, with
an average residence time which diverges with λ and with
N. For μ > λ > 0, large clusters of agents with identical
choice are observed to form and dissipate throughout the
lattice [Fig. 13 (right)]. After sufficiently long times, we
observe the emergence of structured patterns having the
symmetry of the underlying lattice, with the behavior of
agents belonging to a particular structure being highly
correlated. Note that these patterns are dynamic, being
essentially concentric waves that emerge at the center
and travel to the boundary of the region, which continually
expands until it meets another such pattern. Where
two patterns meet, their progress is arrested and their
common boundary resembles a dislocation line. In the
asymptotic limit, several such patterns fill up the entire
system. These patterns indicate the growth of clusters
with strictly correlated choice behavior. The central site
in these clusters acts as the "opinion leader" for the entire
group. This can be seen as analogous to the formation
of "cultural groups" with shared preferences [74]. It is
of interest to note that distributing λ from a random
distribution among the agents disrupts the symmetry of
the patterns, but we still observe patterns of correlated
choice behavior. It is the global feedback (λ ≠ 0) which
determines the formation of large connected regions of
agents having similar choice behavior. This is reflected
in the order parameter, ⟨|M|⟩, where ⟨···⟩ indicates time
averaging. Fig. 14 (right) shows the order parameter
increasing with λ in both one- and two-dimensional lattices,
signifying the transition from a disordered state to an
ordered state, where neighboring agents have coordinated
their choices.

FIG. 14: (Left) The probability distribution of the collective decision M in a two-dimensional square lattice of 100 × 100 agents.
The adaptation rate μ = 0.1, and the learning rate λ is increased from 0 to 0.1 to show the transition from unimodal to bimodal
behavior. The system was simulated for 5 × 10^4 iterations to obtain the distribution. (Right) The order parameter ⟨|M|⟩ for
one- and two-dimensional lattices. The adaptation rate is μ = 0.1, while λ is increased gradually to show the transition to an
ordered state. Note that for higher values of μ the two curves are virtually identical. There is very little system size-dependence
of the curves.
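
The order parameter ⟨|M|⟩ discussed above can be estimated from a simulated time series of M. The sketch below is illustrative only and is not the code used for Fig. 14. It assumes that each agent's field is the average choice of its nearest neighbors on a square lattice with open (non-periodic) edges, that preferences θ_i adapt at rate μ toward the agent's own latest choice, and that they receive an extra correction at rate λ when the agent's previous choice disagreed with sign(M). The initial distribution of θ_i, the tie rule and all names are assumptions.

```python
import random


def lattice_neighbors(L):
    """Nearest neighbors on an L x L square lattice with open edges (L >= 2)."""
    nbrs = []
    for r in range(L):
        for c in range(L):
            here = []
            if r > 0:
                here.append((r - 1) * L + c)
            if r < L - 1:
                here.append((r + 1) * L + c)
            if c > 0:
                here.append(r * L + c - 1)
            if c < L - 1:
                here.append(r * L + c + 1)
            nbrs.append(here)
    return nbrs


def run_lattice(L, mu, lam, steps, S, theta):
    """Synchronous dynamics; returns the time series of the collective choice M."""
    nbrs = lattice_neighbors(L)
    S, theta = list(S), list(theta)
    n = L * L
    series = []
    for _ in range(steps):
        M = sum(S) / n
        majority = (M > 0) - (M < 0)  # 0 when M is exactly zero
        new_S = []
        for i in range(n):
            # Assumed: field is the neighbors' average choice.
            field = sum(S[j] for j in nbrs[i]) / len(nbrs[i])
            x = field - theta[i]
            # Assumed tie rule: keep the previous choice.
            new_S.append(1 if x > 0 else -1 if x < 0 else S[i])
        for i in range(n):
            theta[i] += mu * new_S[i]  # adaptation
            if majority != 0 and S[i] != majority:
                theta[i] += lam * S[i]  # learning
        S = new_S
        series.append(sum(S) / n)
    return series


def order_parameter(series):
    """Time average of |M|."""
    return sum(abs(m) for m in series) / len(series)


def random_state(L, seed=0):
    """Random initial choices and preferences (uniform theta is an assumption)."""
    rng = random.Random(seed)
    n = L * L
    S = [rng.choice((-1, 1)) for _ in range(n)]
    theta = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return S, theta
```

For instance, `order_parameter(run_lattice(20, 0.1, lam, 500, *random_state(20)))` evaluated over a range of `lam` values traces a curve that can be compared qualitatively with the right panel of Fig. 14.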

Our model seems to provide an explanation for the observed
bimodality in a large number of social or economic
phenomena, e.g., in the distribution of the gross income
for movies released in theaters across the USA during
the period 1997-2003 [33]. Bimodality in this context
implies that movies either achieve enormous success or
are dismal box-office failures. Based on the model presented
here, we conclude that, in such a situation, the
moviegoers' choice depends not only on their neighbors'
choice, but also on how well previous action based on such
neighborhood information agreed with media reports and
reviews of movies indicating the overall or community
choice. Hence, the case of λ > 0, indicating the reliance
of an individual agent on the aggregate information,
imposes correlation among agents' choices across the
community, which leads to a bimodal gross distribution.

Based on a study of the rank distribution of movie
earnings according to their ratings [75], we further
speculate that movies made for children (rated G) have a
significantly different popularity mechanism than those
made for older audiences (PG, PG-13 and R). The former
show striking similarity with the rank distribution
curve obtained for λ = 0, while the latter are closer to
the curves corresponding to λ > 0. This agrees with the
intuitive notion that children are more likely to base their
choices about movies (or other products, such as toys) on
the choice of their friends or classmates, while adults are
more likely to be swayed by reports in mass media about
the popular appeal of a movie. This suggests that one can
tailor marketing strategies to different segments of the
population depending on the role that global feedback
plays in their decisions. Products whose target market
has λ = 0 can be better disseminated by distributing
free samples in neighborhoods, while for λ > 0, a
mass media campaign blitz will be more effective.

IV. CONCLUSIONS 

In this article we have primarily made an attempt to
ascertain the general empirical features inherent in many
popularity phenomena. We observe that the distribution
of popularity in various contexts often exhibits long tails,
which seem to follow either a log-normal form or a
power law with the exponent α ~ 1
(Zipf's law). While the log-normal distribution would
arise naturally in any multiplicative stochastic process,
in the context of popularity it would be natural to interpret
it as a manifestation of the interplay of the multiple
factors involved in an agent making a decision to adopt
a particular product or idea. Further, there is no necessity
for interactions among agents for this particular
distribution in popularity to be observed. On the other
hand, distributions with power law tails would seem to
necessarily entail inter-agent interactions, e.g., a process
whereby agents follow the choice of other agents, with
a particular choice becoming more preferable if many
more agents opt for it (footnote 26). This is not necessarily an
irrational "herding" effect; for example, in the case of
popularity of cities, the larger the population of a city,
the more likely it is to attract migrants, owing to the
larger variety of employment opportunities. Thus the
very fact that more agents have chosen a particular
alternative may make that choice more preferable than others.
Seen in this light, the popularity distribution should show
a log-normal distribution in situations where individual
quality preferences play an important role in making a
choice, while, in cases where the choice of other agents is
a paramount influence in the decision process of an agent,
Zipf's law should emerge (footnote 27). In either case, a stochastic
process is sufficient to generate the popularity distributions
seen in reality. This suggests that the emergence of
popularity can be explained entirely as an outcome of a
sequence of chance events.

26 In the economics literature, this is referred to as positive
externality [zl

27 Montroll & Shlesinger ffl\ have shown that a simple extension
to multiplicative stochastic processes can generate power-law
tails from a log-normal distribution. Recently, Bhattacharyya
et al. [78] have also proposed a very simple model showing the
asymptotic emergence of Zipf's law in the presence of random
interaction among agents; it is interesting in the context of our
statements here that, if the mean field theoretic arguments used
in the above paper are extended to the case of no interactions
amongst agents, they would suggest a log-normal distribution.
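
The statement that a multiplicative stochastic process produces a log-normal distribution without any interaction among agents can be illustrated numerically. In the minimal sketch below, each item's popularity is multiplied by an independent random factor at every step, so that its logarithm is a sum of independent terms and approaches a normal distribution by the central limit theorem. The range of the growth factor, the number of steps and all names are illustrative assumptions, not values taken from the data analysed in this paper.

```python
import math
import random
import statistics


def multiplicative_popularity(n_items=2000, n_steps=100, seed=1):
    """Final popularity of n_items, each grown by independent random factors."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_items):
        x = 1.0
        for _ in range(n_steps):
            # Assumed growth factor, uniform on [0.8, 1.25]; no agent interaction.
            x *= rng.uniform(0.8, 1.25)
        values.append(x)
    return values


def skewness(values):
    """Sample skewness (third standardized moment)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


if __name__ == "__main__":
    popularity = multiplicative_popularity()
    log_popularity = [math.log(v) for v in popularity]
    print("skewness of popularity:     ", skewness(popularity))
    print("skewness of log(popularity):", skewness(log_popularity))
```

A strongly right-skewed distribution of the raw values, together with a nearly symmetric distribution of their logarithms, is the signature expected of a log-normal form.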
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